A common algebraic description for probabilistic and quantum computations

Martin Beaudry, José M. Fernandez, Markus Holzer

Université de Sherbrooke, Université de Montréal, Technische Universität München

Abstract

We study the computational complexity of the problem SFT (Sum-free Formula partial Trace): given a tensor
formula F over a subsemiring of the complex field $(\mathbb{C}, +, \cdot)$ plus a positive integer k, under the restrictions that
all inputs are column vectors of $L_2$-norm 1 and norm-preserving square matrices, and that the output matrix is a
column vector, decide whether the k-th partial trace of $FF^{\dagger}$ is greater than 1/2. The k-th partial trace of a matrix is
the sum of its lowermost k diagonal elements. We also consider the promise version of this problem, where the
1/2 threshold is an isolated cutpoint. We show how to encode a quantum or reversible gate array into a tensor
formula which satisfies the above conditions, and vice versa; we use this to show that the promise version of SFT
is complete for the class BPP for formulas over the semiring $(\mathbb{Q}^+, +, \cdot)$ of the positive rational numbers, for BQP
in the case of formulas defined over the field $(\mathbb{Q}, +, \cdot)$, and for P in the case of formulas defined over the Boolean
semiring, all under logspace-uniform reducibility. This suggests that the difference between probabilistic and
quantum polynomial-time computers may ultimately lie in the possibility, in the latter case, of having destructive
interference between computations occurring in parallel.


1 Introduction 


The "algebraic approach" in the theory of computational complexity consists in characterizing complexity classes 
within unified frameworks built around a computational model or problem involving an algebraic structure (usually 
finite or finitely generated) as the main parameter. In this way, various complexity classes are seen to share the same 
definition, up to the choice of the underlying algebra. Successful examples of this approach include the description of 
NC$^1$ and its subclasses AC$^0$ and ACC$^0$ in terms of polynomial-size programs over finite monoids and analogous
results for PSPACE, the polynomial hierarchy and the polytime mod-counting classes, through the use of polytime
leaf languages [14]. A more recent example is the complexity of problems whose input is a tensor formula, i.e.
a fully parenthesized expression where the inputs are matrices (given in full) over some finitely generated algebra
and the allowed operations are matrix addition, multiplication, and tensor product (also known as outer, or direct, or
Kronecker product). Depending on the semiring over which the formula is defined, the problem of deciding whether
the output matrix contains at least one nonzero entry is complete for NP (Boolean semiring) and MOD$_q$-P (modulo
semiring $\mathbb{Z}_q$) [0]. Other common-sense computational problems on tensor formulas were analyzed in [0, B].
Tensor formulas are a compact way of specifying very large matrices. As such, they immediately find a potential 
application in the description and the behavior of circuits, be they classical Boolean, arithmetic (tensor formulas over 
the appropriate semiring) or quantum (formulas over the complex field, or an adequately chosen subsemiring thereof). 
In this paper, we formalize and confirm this intuition, in that we define a meaningful computational problem over 
tensor formulas which enables us to capture the significant complexity classes P, BPP, and BQP. Looking at variants 
of the problem enables us to capture further complexity classes; a table in the last section summarizes our results. 
Apart from offering a first application of the algebraic approach to quantum computing, our paper reasserts the point
made by Fortnow [12], that for the classes BPP and BQP, the jump from classical to quantum polynomial-time
computation consists in allowing negative matrix entries for the evolution operators, which means the possibility of
having destructive interference between different computations done in parallel.

2 Background on circuits and complexity 

We use standard notions and notations from computational complexity, see for example [20]. In particular we
assume that the reader is familiar with the deterministic and probabilistic Turing machine models, with the usual
notion of a Boolean circuit, and with logspace many-one reducibility: a set K is logspace many-one reducible to
a set L if there is a logspace computable mapping f such that for all x, $x \in K$ iff $f(x) \in L$.

To handle the three types of computation discussed in this paper (deterministic, probabilistic and quantum), we use 
gate arrays as a common setting. From now on, we reserve the word circuit to the traditional idea of an acyclic 
network with a unique output bit, and we use gate array to describe those computational networks which satisfy the 
following definition. 

Definition 2.1. Let $n, d \geq 1$. A width-$n$, $d$-leveled gate array is an $n \times d$ array where each line is called a wire and
each column a level. The size of a gate array is the number $nd$. A gate is a set of array entries from the same level
(corresponding to the wires involved in the gate's operation) together with a square matrix which describes its action.
Gates on a given level act on disjoint sets of entries from this level. Let the levels be numbered 1 to d from left to
right. Each wire carries a bit from a level to the next in the left-to-right direction; the value entering column 1 from
the left is called an input; the value exiting level d to the right is an output.

A gate of k binary inputs operates on the set of k-bit vectors by mapping each of the $2^k$ possible combinations of
input values to a combination of output values. The extra constraint, that all gates act on neighboring wires, can be 
enforced on an arbitrary array at the cost of inserting a quadratic number of extra levels with "swap" gates, which 
interchange the values carried by two adjacent wires. 

Gate arrays are used in particular to describe reversible classical computations. A computation is reversible iff knowledge
of its output is sufficient to be able to deterministically reconstruct the input. It has been shown that for any
polynomial-time deterministic algorithm there exists an equivalent polynomial-time reversible algorithm; in other
words, from every polynomial-size Boolean circuit an equivalent reversible gate array [13] can be constructed, by

• modifying the circuit so that the numbers of input and output bits are equal;

• replacing the usual one-output gates with reversible gates;

• making sure that an especially identified "decision" bit takes value 1 at the output level iff the original circuit's
output is 1.

From the description of the original circuit, its equivalent reversible gate array can be constructed in deterministic 
logspace; circuit size and depth are increased only by a polynomial factor; usually, a polynomial number of extra 
input bits initialized at 0, called ancillary bits, also has to be added in the process. It has been shown that this array 
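can be built solely with the one- and two-bit reversible operations, plus either one of the "Toffoli" or "Fredkin"
gates, which are the $8 \times 8$ permutation matrices

$$\mathrm{Toffoli} = \begin{pmatrix} I_6 & 0 \\ 0 & X \end{pmatrix}, \qquad \mathrm{Fredkin} = \begin{pmatrix} I_5 & 0 & 0 \\ 0 & X & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

where the top left position corresponds to bit values 000 and the bottom right to 111.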
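As an illustration of the two three-bit gates, the following minimal sketch builds their $8 \times 8$ matrices from the bit-level maps and checks that they are permutation matrices and their own inverses. The helper names (`gate_matrix`, `bits_to_index`) and the choice of the first bit as the control of the Fredkin gate are assumptions made for this example.

```python
from itertools import product

def bits_to_index(bits):
    """Read a tuple of bits as a binary number, so 000 -> 0 and 111 -> 7."""
    index = 0
    for b in bits:
        index = 2 * index + b
    return index

def gate_matrix(f, n_bits):
    """0/1 matrix of a map f on n-bit tuples (column = input, row = output)."""
    size = 2 ** n_bits
    matrix = [[0] * size for _ in range(size)]
    for bits in product((0, 1), repeat=n_bits):
        matrix[bits_to_index(f(bits))][bits_to_index(bits)] = 1
    return matrix

def toffoli(bits):
    a, b, c = bits
    return (a, b, c ^ (a & b))            # flip c iff a = b = 1

def fredkin(bits):
    a, b, c = bits
    return (a, c, b) if a == 1 else (a, b, c)   # swap b and c iff a = 1

def is_permutation_matrix(matrix):
    """True iff every line and every column holds exactly one 1."""
    n = len(matrix)
    one_hot = [0] * (n - 1) + [1]
    return (all(sorted(row) == one_hot for row in matrix)
            and all(sorted(col) == one_hot for col in zip(*matrix)))

TOFFOLI = gate_matrix(toffoli, 3)
FREDKIN = gate_matrix(fredkin, 3)
```

Both matrices differ from the identity only in a $2 \times 2$ block that exchanges two basis states (110 and 111 for Toffoli, 101 and 110 for Fredkin).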

Standard techniques can therefore be used in sequence to transform the description of a polytime deterministic Turing 
machine and its input x into an instance of the Circuit Value Problem with constant inputs (where x is hardwired) 
[16], then to turn this circuit into a reversible gate array, in order to give the following definition for the class P.
(Alternatively, one can start from the definition of P as the class of those languages decided by logspace-uniform 
families of polynomial-size Boolean circuits.) 

Definition 2.2. P is the class of those languages $L \subseteq \Sigma^*$ for which there exists a logspace-computable function which,
given an input $x \in \Sigma^*$, computes the encoding of a reversible gate array C(x) with constant inputs, whose decision
bit takes value 1 at the output level iff $x \in L$.

An encoding for C(x) is suitable for this definition if it consists of a reasonable description of the array's inputs, 
wiring and gates; the latter can wlog be restricted to have constant fan-in/fan-out, so that the action of each gate can 
be specified with a constant-size Boolean matrix. 

Complexity classes for polynomial-time probabilistic computation are usually defined in terms of a polytime Turing 
machine which picks a random bit at every step of its computation, and otherwise acts deterministically (see e.g. [§]). 
An equivalent circuit is built from this Turing machine and its input, in which an appropriate number of random bits 
are fed in alongside the (constant) input bits; whether the input belongs to L is verified by counting those combinations 
of random bits for which the output bit takes value 1. All random bit combinations have equal length and are equally
likely. 

Definition 2.3. PP is the class of those languages $L \subseteq \Sigma^*$ for which there exists a logspace-computable function
which, given an input $x \in \Sigma^*$, yields the encoding of a reversible gate array C(x) with a combination of constant and
random inputs, such that $x \in L$ iff $f_{C(x)} > \frac{1}{2}$ and $x \notin L$ iff $f_{C(x)} < \frac{1}{2}$, where $f_{C(x)}$ denotes the probability that
C(x)'s decision bit takes value 1 at the output level.

BPP is defined with the extra condition that there exists a parameter $\varepsilon$, $0 < \varepsilon < \frac{1}{2}$, such that $x \in L$ iff $f_{C(x)} > \frac{1}{2} + \varepsilon$
and $x \notin L$ iff $f_{C(x)} < \frac{1}{2} - \varepsilon$.

The class NP can be similarly defined, with the condition that $x \in L$ iff $f_{C(x)} > 0$.

The definition of BPP includes the implicit constraint, that the proportion of accepting computations can never fall
inside the interval $[\frac{1}{2} - \varepsilon, \frac{1}{2} + \varepsilon]$; in other words, $\frac{1}{2}$ is an isolated cutpoint. Note that both PP and BPP can be redefined
with a cutpoint other than $\frac{1}{2}$.
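The counting view behind these definitions can be sketched as follows: enumerate all equally likely random-bit combinations, count those for which the decision bit is 1, and compare the resulting fraction with the cutpoint. The array `toy_array`, the convention that wire 0 carries the decision bit, and the value of `eps` used below are hypothetical choices for this example only.

```python
from fractions import Fraction
from itertools import product

def toy_array(bits):
    """Hypothetical reversible array on wires (d, x, r1, r2).

    It flips the decision bit d iff x = 1 or r1 = r2 = 1 and leaves the
    other wires unchanged, so applying it twice restores the input.
    """
    d, x, r1, r2 = bits
    return (d ^ (x | (r1 & r2)), x, r1, r2)

def acceptance_probability(array, constant_bits, n_random):
    """Fraction of random-bit combinations whose output decision bit is 1."""
    accepting = 0
    for random_bits in product((0, 1), repeat=n_random):
        output = array(tuple(constant_bits) + random_bits)
        accepting += output[0]   # wire 0 is taken to be the decision bit
    return Fraction(accepting, 2 ** n_random)

def classify(probability, eps):
    """BPP-style decision with cutpoint 1/2 isolated by eps."""
    half = Fraction(1, 2)
    if probability > half + eps:
        return "accept"
    if probability < half - eps:
        return "reject"
    return "promise violated"

p_yes = acceptance_probability(toy_array, (0, 1), 2)   # input bit x = 1
p_no = acceptance_probability(toy_array, (0, 0), 2)    # input bit x = 0
```

With `eps = Fraction(1, 8)` the two probabilities fall on opposite sides of the forbidden interval, which is the situation the promise in the definition of BPP guarantees.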



Polynomial-time quantum computation was defined originally in terms of quantum Turing machines [@]: the data 
handled by this machine (qubits) are formally represented as a vector whose complex components give the distribution 
of amplitudes for the probability that the qubits be in a certain combination of values; each transition of the machine 
acts as a unitary transformation on this vector. 



It was later shown [21] that a quantum Turing machine and its input can be encoded in deterministic polynomial
time into an array of quantum gates, if one is allowed a small probability of error. Each wire in a quantum gate
array represents a path of a single qubit (in time or space, forward from left to right), and is described by a state in a
two-dimensional Hilbert space with basis $|0\rangle$ and $|1\rangle$. Just as classical bit strings can represent the discrete states of
arbitrary finite dimensionality, so a string of n qubits can be used to represent quantum states in any Hilbert space of
dimensionality up to $2^n$. The action of a gate of k inputs is a unitary operation of the group $U(2^k)$, i.e., a generalized
rotation in a Hilbert space of dimension $2^k$. It has been shown that a small set of one- and two-qubit gates suffices to
build quantum arrays, in that any n-qubit gate can be simulated by a subarray consisting of two-qubit gates, and the 
number thereof is at most an exponential in n (see for example [||, ^, 18, |l7|]). As two-qubit gates it suffices to take 
the controlled-not N. Because of its usefulness we also mention the two-qubit "swap" gate T. 



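$$N = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$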



The vector of qubits received as input by a quantum gate array can be regarded as a linear combination of pure states. 
There is a measurement done on the array's output, which consists in projecting the output vector onto a subspace, 
usually defined by setting a chosen subset of the qubits to $|1\rangle$ ("accepting subspace"). If the qubits are numbered 1
to n, then a k-qubit accepting subset can be chosen to be qubits 1 to k, at the cost of inserting a quadratic number of
extra swap gates. For the sake of simplicity, we can assume that the final output state will be such that all qubits other
than the decision qubit have value $|0\rangle$. This is without loss of generality, as it will be possible to "uncompute" the
circuit while keeping the value of the decision bit. Thus, the accepting subspace has dimension 1, and contains only
one basis vector, and similarly for the rejecting subspace.

Definition 2.4. BQP is the class of those languages $L \subseteq \Sigma^*$ for which there exists a logspace-computable function
which, given an input $x \in \Sigma^*$, yields the encoding of a quantum gate array C(x) with constant inputs, and a parameter
$\varepsilon$, $0 < \varepsilon < \frac{1}{2}$, such that $x \in L$ iff $f_{C(x)} > \frac{1}{2} + \varepsilon$ and $x \notin L$ iff $f_{C(x)} < \frac{1}{2} - \varepsilon$, where $f_{C(x)}$ denotes the probability
that the qubits of C(x) be projected onto the accepting subspace at the output level.

The remark on parameter $\varepsilon$ made after the definition of BPP also holds here. The definition of BQP still holds if we
restrict the gates to implement unitary operators with entries taken in a small set of rationals [|I|], and to determine 
acceptance or rejection by the value of a single qubit [Q]. 



The same definition, with unitary operators and input vectors having rational entries and without the condition that $\frac{1}{2}$
be an isolated cutpoint, yields a "quantum" version of the (classical) class PP. However, this "new" class is in fact no
different from PP itself, as can be shown by a simple counting complexity argument.

For any language L in this class, there exists a quantum circuit that accepts it, for which we can define the non-negative
functions $f(x)$ and $g(x)$, as the sum of all the positive and negative contributions, respectively, to the total
amplitude for the accepting configuration on a given input x. The amplitude of this unique accepting configuration
is $f(x) - g(x)$. Similarly, define $f'(x)$ and $g'(x)$ for the rejecting configuration, with the corresponding rejecting
amplitude being $f'(x) - g'(x)$. It is easy to see that $f$, $g$, $f'$, and $g'$ are all #P functions. The difference between the
probability of accepting and rejecting of this circuit is thus

$$(f - g)^2 - (f' - g')^2 = f^2 + g^2 + 2f'g' - (f'^2 + g'^2 + 2fg),$$

which is a GapP function, since #P is closed under (finite) sum and product. This function will be positive if and only
if x is in L, which is another way of characterizing languages in the class PP []1

On the other hand, the languages defined with quantum gate arrays where unitary operators have rational entries and
such that $x \in L$ iff $f_{C(x)} > 0$ form the complexity class NQP, the quantum analogue to NP, which coincides with the
(classical) class $\mathrm{coC}_{=}\mathrm{P}$ [10].



3 Tensor Algebra 



A semiring is a tuple $(K, +, \cdot)$ with $\{0, 1\} \subseteq K$ and binary operations $+, \cdot : K \times K \to K$ (sum and product), such that
$(K, +, 0)$ is a commutative monoid, $(K, \cdot, 1)$ is a monoid, multiplication distributes over sum, and $0 \cdot a = a \cdot 0 = 0$
for every a in K (see, e.g., [15]). A semiring is a ring if and only if $(K, +, 0)$ is a group. In this paper we consider
the following semirings: the Booleans $(\mathbb{B}, \vee, \wedge)$, the field of rational numbers $(\mathbb{Q}, +, \cdot)$, the semiring $(\mathbb{Q}^+, +, \cdot)$ of
positive rational numbers, and the field of complex numbers $(\mathbb{C}, +, \cdot)$.

Let $M_K$ denote the set of all matrices over K, and define $M_K^{k,\ell} \subseteq M_K$ to be the set of all matrices of order $k \times \ell$.
Let $[k]$ denote the set $\{1, 2, \ldots, k\}$; for a matrix A in $M_K^{k,\ell}$ and $(i, j) \in [k] \times [\ell]$, the $(i, j)$-th entry of A is denoted by $a_{ij}$
or $(A)_{ij}$. Addition and multiplication of matrices in $M_K$ are defined in the usual way. Additionally we consider the
tensor product $\otimes : M_K \times M_K \to M_K$ of matrices, also known as Kronecker product, outer product, or direct product,
which is defined as follows: for $A \in M_K^{k,\ell}$ and $B \in M_K^{m,n}$ let $A \otimes B \in M_K^{km,\ell n}$ be

$$A \otimes B = \begin{pmatrix} a_{1,1} \cdot B & \cdots & a_{1,\ell} \cdot B \\ \vdots & \ddots & \vdots \\ a_{k,1} \cdot B & \cdots & a_{k,\ell} \cdot B \end{pmatrix}.$$

Hence $(A \otimes B)_{i,j} = (A)_{q,r} \cdot (B)_{s,t}$ where $i = m \cdot (q - 1) + s$ and $j = n \cdot (r - 1) + t$.

The following notation is used: let $I_n$ be the order n identity matrix, $e_i^n$ the $n \times 1$ column vector whose i-th entry has
value 1 and the others 0, and let $A^{\otimes n}$ stand for the n-fold iteration $A \otimes A \otimes \cdots \otimes A$.
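The tensor product and the accompanying notation can be made concrete with a small sketch. Matrices are plain lists of rows; the function names (`kron`, `identity`, `unit_vector`, `kron_power`) are choices made for this example and are not taken from the paper.

```python
def kron(a, b):
    """Kronecker product of a (k x l) and b (m x n), giving a (km x ln) matrix.

    With 0-based indices, entry (m*q + s, n*r + t) equals a[q][r] * b[s][t].
    """
    k, l = len(a), len(a[0])
    m, n = len(b), len(b[0])
    return [[a[q][r] * b[s][t] for r in range(l) for t in range(n)]
            for q in range(k) for s in range(m)]

def identity(n):
    """Order-n identity matrix I_n."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

def unit_vector(i, n):
    """e_i^n: the n x 1 column vector with a 1 in position i (1-based)."""
    return [[int(row == i)] for row in range(1, n + 1)]

def kron_power(a, n):
    """n-fold iteration a (x) a (x) ... (x) a (the 1 x 1 matrix [1] for n = 0)."""
    result = [[1]]
    for _ in range(n):
        result = kron(result, a)
    return result

example = kron([[1, 2], [3, 4]], [[0, 1], [1, 0]])
```

For instance, `kron_power(unit_vector(2, 2), 3)` is the $8 \times 1$ unit vector $e_8^8$, which is how the bit string 111 is represented later in the text.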

Stride permutations, which play a crucial role in the implementation of efficient parallel programs for block recursive
algorithms such as the fast Fourier transform (FFT) and Batcher's bitonic sort (see [19]), will be useful in our proofs.
The mn-point stride n permutation $P^{mn}_{n} \in M_K^{mn,mn}$ is defined as

$$P^{mn}_{n} \cdot (e^m_i \otimes e^n_j) = e^n_j \otimes e^m_i,$$

where $e^m_i \in M_K^{m,1}$ and $e^n_j \in M_K^{n,1}$. In other words, the matrix $P^{mn}_{n}$ permutes the elements of a vector of length mn
with stride distance n. We will make use of the following identities on stride permutations.

Proposition 3.1. The following holds for all $\ell$, m, n:

1. $(P^{mn}_{n})^{-1} = P^{mn}_{m}$.

2. $P^{\ell mn}_{mn} = P^{\ell mn}_{m} \cdot P^{\ell mn}_{n}$.

3. $P^{\ell mn}_{n} = (P^{\ell n}_{n} \otimes I_m) \cdot (I_\ell \otimes P^{mn}_{n})$. □
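The three identities can be checked numerically on small cases. The sketch below assumes the usual convention that the stride permutation $P^{mn}_{n}$ sends $e^m_i \otimes e^n_j$ to $e^n_j \otimes e^m_i$, so that it maps $x \otimes y$ to $y \otimes x$ for column vectors x and y of lengths m and n; the function `stride_perm(total, stride)` and its argument order are choices made for this example.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def stride_perm(total, stride):
    """P^{total}_{stride}: sends e^m_i (x) e^n_j to e^n_j (x) e^m_i,
    where n = stride and m = total // stride (0-based indices below)."""
    m, n = total // stride, stride
    p = [[0] * total for _ in range(total)]
    for i in range(m):
        for j in range(n):
            p[j * m + i][i * n + j] = 1
    return p

def check_identities(l, m, n):
    """Check the three identities of Proposition 3.1 for given l, m, n."""
    first = matmul(stride_perm(m * n, n), stride_perm(m * n, m)) == identity(m * n)
    second = stride_perm(l * m * n, m * n) == matmul(
        stride_perm(l * m * n, m), stride_perm(l * m * n, n))
    third = stride_perm(l * m * n, n) == matmul(
        kron(stride_perm(l * n, n), identity(m)),
        kron(identity(l), stride_perm(m * n, n)))
    return first and second and third
```

Under this convention `stride_perm(4, 2)` is the $4 \times 4$ matrix that exchanges the basis states 01 and 10, i.e. the two-wire swap.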



3.1 Tensor formulas 

Definition 3.2. The tensor formulas over a semiring K and their order are recursively defined as follows.

1. Every matrix F from $M_K^{k,\ell}$ with entries from K is an (atomic) tensor formula of order $k \times \ell$.

2. If F and G are tensor formulas of order $k \times \ell$ and $m \times n$, respectively, then

$(F + G)$ is a tensor formula of order $k \times \ell$ if $k = m$ and $\ell = n$;
$(F \cdot G)$ is a tensor formula of order $k \times n$ if $\ell = m$;
$(F \otimes G)$ is a tensor formula of order $km \times \ell n$.

3. Nothing else is a tensor formula.

We say that a tensor formula F is sum-free whenever none of F and its subformulas has the form G + H. Let $T_K$ denote
the set of all tensor formulas over K, and define $T_K^{k,\ell} \subseteq T_K$ to be the set of all tensor formulas of order $k \times \ell$.

In this paper we only consider semiring elements whose value can be given with a standard encoding over some finite
set Q. Input matrices can therefore be string-encoded using list notation such as "[[001] [101]]." Nonatomic tensor
formulas can be encoded over the alphabet $\Sigma = Q \cup \{[, ], (, ), \cdot, +, \otimes\}$. Strings over $\Sigma$ which do not encode valid
formulas are deemed to represent the trivial tensor formula 0 of order $1 \times 1$.

The size of a tensor formula F is 1 if F is atomic, otherwise $F = G \circ H$ for $\circ \in \{+, \cdot, \otimes\}$ and the size of F is 1 plus
the sizes of G and H. The diameter of tensor formula F, denoted by $|F|$, is $\max\{k, \ell\}$ if F is atomic of order $k \times \ell$;
otherwise we have that $F = G \circ H$ is of order $k \times \ell$, and $|F| = \max\{k, \ell, |G|, |H|\}$.

It will sometimes be convenient to speak of a tensor formula in graph-theoretical terms: in this context, a tensor 
formula is a binary tree whose edges are directed toward the root ("output node"), whose leaves ("input nodes") are 
labelled with atomic formulas and each of whose interior nodes is labelled with an operation from the set $\{+, \cdot, \otimes\}$.
The depth of a tensor formula is the maximum root-leaf distance. 
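The recursive notions of order, size, diameter, depth and sum-freeness translate directly into code. In this sketch an atomic formula is a matrix (a list of rows) and a composite formula is a tuple `(op, G, H)` with `op` one of `'+'`, `'*'`, `'x'` (for the tensor product); this encoding is a hypothetical one chosen for the example, and an invalid formula raises an error here rather than being mapped to the trivial $1 \times 1$ formula.

```python
def is_atomic(f):
    return isinstance(f, list)

def order(f):
    """Order (rows, columns) of a formula, following Definition 3.2."""
    if is_atomic(f):
        return (len(f), len(f[0]))
    op, g, h = f
    (k, l), (m, n) = order(g), order(h)
    if op == '+' and (k, l) == (m, n):
        return (k, l)
    if op == '*' and l == m:
        return (k, n)
    if op == 'x':
        return (k * m, l * n)
    raise ValueError("invalid tensor formula")

def size(f):
    return 1 if is_atomic(f) else 1 + size(f[1]) + size(f[2])

def diameter(f):
    k, l = order(f)
    if is_atomic(f):
        return max(k, l)
    return max(k, l, diameter(f[1]), diameter(f[2]))

def depth(f):
    """Maximum root-leaf distance of the formula tree."""
    return 0 if is_atomic(f) else 1 + max(depth(f[1]), depth(f[2]))

def is_sum_free(f):
    return is_atomic(f) or (f[0] != '+' and is_sum_free(f[1]) and is_sum_free(f[2]))

X = [[0, 1], [1, 0]]
I2 = [[1, 0], [0, 1]]
E1, E2 = [[1], [0]], [[0], [1]]
F = ('*', ('x', X, I2), ('x', E1, E2))   # (X (x) I2) . (e1 (x) e2)
```

The sample formula F has order $4 \times 1$, size 7, diameter 4 and depth 2, and it is sum-free.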

Definition 3.3. For each semiring K and each k and each $\ell$ we define $\mathrm{val}_K^{k,\ell} : T_K^{k,\ell} \to M_K^{k,\ell}$, that is, we associate with
node f of order $k \times \ell$ of a tensor formula F its $k \times \ell$ matrix "value," which is defined as follows:

1. $\mathrm{val}_K^{k,\ell}(f) = F$ if f is an input node labeled with F,

2. $\mathrm{val}_K^{k,\ell}(f) = \mathrm{val}_K^{k,\ell}(g) + \mathrm{val}_K^{k,\ell}(h)$ if $f = (g + h)$,

3. $\mathrm{val}_K^{k,\ell}(f) = \mathrm{val}_K^{k,m}(g) \cdot \mathrm{val}_K^{m,\ell}(h)$ if $f = (g \cdot h)$, and

4. $\mathrm{val}_K^{k,\ell}(f) = \mathrm{val}_K^{k/m,\ell/n}(g) \otimes \mathrm{val}_K^{m,n}(h)$ if $f = (g \otimes h)$.

5. For completeness, recall that $\mathrm{val}_K^{1,1}(f) = 0$ whenever the formula is not valid.

The value $\mathrm{val}_K^{k,\ell}(F)$ of a tensor formula F of order $k \times \ell$ is defined to be the value of the unique output node.
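A direct recursive evaluator illustrates this definition. As an assumption of this sketch, atomic formulas are matrices given as lists of rows, composite formulas are tuples `(op, g, h)` with `op` in `'+'`, `'*'`, `'x'`, entries are Python integers or `Fraction` values, and an invalid formula raises an error instead of evaluating to the $1 \times 1$ zero matrix.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def val(f):
    """Matrix value of a tensor formula, evaluated from the leaves to the root."""
    if isinstance(f, list):          # input node labeled with a matrix
        return f
    op, g, h = f
    a, b = val(g), val(h)
    if op == '+' and (len(a), len(a[0])) == (len(b), len(b[0])):
        return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    if op == '*' and len(a[0]) == len(b):
        return matmul(a, b)
    if op == 'x':
        return kron(a, b)
    raise ValueError("invalid tensor formula")

X = [[0, 1], [1, 0]]
I2 = [[1, 0], [0, 1]]
E1, E2 = [[1], [0]], [[0], [1]]
# Negating the first of two bits, applied to the bit string 01.
flipped = val(('*', ('x', X, I2), ('x', E1, E2)))
# A rational rotation applied to e1.
ROT = [[Fraction(3, 5), Fraction(-4, 5)], [Fraction(4, 5), Fraction(3, 5)]]
rotated = val(('*', ROT, E1))
```

Note that evaluating a formula this way can take exponential time in general, since the tensor product multiplies orders; the sketch is only meant for small instances.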

3.2 The sum-free partial trace problem 

A column vector v with complex coefficients is a unit vector iff its $L_2$-norm is 1, that is, iff $v^{\dagger} v = 1$. In this paper,
we work on probabilistic and quantum computations where the probability amplitudes are encoded in unit column
vectors, and the foremost requirement on the computing model is that the inner product (hence also the $L_2$-norm)
be preserved at each step of a computation. The action of each such step on the various combinations of values
transported by the wires is described with a square matrix; our requirement is equivalent to asking that each matrix
preserves the inner product (unitary matrices).

A square matrix M over the complex numbers is unitary iff $M^{\dagger} = M^{-1}$. For a matrix M over the real numbers, this
translates into $M^T = M^{-1}$, which means that M is orthogonal. It is an easily verified fact that an orthogonal matrix
contains only nonnegative entries if, and only if, it is a permutation matrix (i.e., exactly one entry per line and column
is 1 and all others are 0).

In the sequel, whenever we deal simultaneously with the cases of matrices with real or complex coefficients, we
use the notations and vocabulary from the real case alone, in order to make the text easier to read.

The trace of a square matrix is the sum of its diagonal elements; for $k > 0$, its k-th partial trace is the sum of its last k
diagonal elements, counting upwards from the lower right corner. For completeness, if k exceeds the diameter of the
matrix, then the k-th partial trace coincides with the usual trace.
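The partial trace is straightforward to compute once the matrix is available. The sketch below works in the real case with exact rationals; `outer_square`, `partial_trace` and `exceeds_threshold` are names chosen for this example, and the default threshold of 1/2 is an assumption taken from the abstract.

```python
from fractions import Fraction

def outer_square(v):
    """v v^T for a real column vector v given as a list of one-entry rows."""
    return [[a[0] * b[0] for b in v] for a in v]

def partial_trace(matrix, k):
    """Sum of the last k diagonal entries; the full trace if k is too large."""
    n = len(matrix)
    k = min(k, n)
    return sum(matrix[i][i] for i in range(n - k, n))

def exceeds_threshold(v, k, alpha=Fraction(1, 2)):
    """Is the k-th partial trace of v v^T greater than alpha?"""
    return partial_trace(outer_square(v), k) > alpha

# A unit vector with rational entries: (3/5)^2 + (4/5)^2 = 1.
v = [[Fraction(3, 5)], [0], [0], [Fraction(4, 5)]]
m = outer_square(v)
```

For a unit vector v, the k-th partial trace of $v v^T$ is the total weight carried by the last k coordinates, which is why it can play the role of an acceptance probability.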

Definition 3.4. A sum-free tensor formula is OSL if and only if it satisfies the conditions: 

• all inputs are orthogonal square matrices and/or unit column vectors; 

• the output matrix is a column vector. 

(We choose the term "orthogonal-system-like" because as we will show, such a formula can be reorganized as a 
product M • V of an orthogonal matrix with a column vector, i.e. as the specification of an orthogonal system of linear 
equations.) 

Definition 3.5. Let K be a finitely generated semiring. An instance of problem SFT(K) ("sum-free formula partial
trace") consists of an order $N \times 1$ OSL tensor formula F over semiring K and a positive integer k; the problem consists
in deciding whether the k-th partial trace of $(\mathrm{val}_K^{N,1}(F)) \cdot (\mathrm{val}_K^{N,1}(F))^T$ is greater than some predetermined constant
$\alpha$, $1/2 < \alpha < 1$. In the "promise version" of SFT(K), no instance can yield a k-th partial trace which evaluates in
the interval $[1 - \alpha, \alpha]$.

We also define a "nonzero version" of SFT(K), as the problem which consists in deciding whether the k-th partial
trace of $(\mathrm{val}_K^{N,1}(F)) \cdot (\mathrm{val}_K^{N,1}(F))^T$ is nonzero.
The following propositions show that basic questions on inputs for problem SFT(K) can be answered in polynomial
time.

Proposition 3.6. |Q] If F is a tensor formula of depth d which has input matrices of diameter at most p, then $|F| \leq p^{2^d}$,
and there exists a formula which outputs a matrix of exactly this diameter. (Proof omitted.)

Proposition 3.7. ||5|] Testing whether a string encodes a valid tensor formula and if so, computing its order, is feasible 
in deterministic polynomial time. (Proof omitted.) 

4 From gate arrays to tensor formulas to gate arrays 

In this section we show how to encode the description of a reversible or quantum gate array into an OSL tensor formula
over the appropriate semiring, and conversely, how to compute from an OSL formula F a gate array which will later
be used as a means to solve an SFT instance built from F.

4.1 From arrays to formulas 

Lemma 4.1. Let C be a gate array operating on n wires, whose gates can be described with orthogonal matrices
over semiring K. There is a logspace computable function which, given a suitable coding of C, computes a tensor
formula F(C) of logarithmic depth such that for each $x = (x_1, \ldots, x_n) \in \{0, 1\}^n$,

$$C(x) = \mathrm{val}_K^{2^n,1}(F(C) \cdot d_x),$$

where $d_x = \bigotimes_{i=1}^{n} \chi_i$, with $\chi_i = e^2_1$ if $x_i = 0$, and $\chi_i = e^2_2$ otherwise.

Proof. Let C have m levels and let $C_i$ denote the i-th level, with $C_1$ the left-most and $C_m$ the right-most. We describe
how to construct an equivalent tensor formula $M(C)$ from C assuming that 0 and 1 are encoded by $e^2_1$ and $e^2_2$, respectively
(for quantum arrays, that $|0\rangle$ and $|1\rangle$ are encoded by $e^2_1$ and $e^2_2$, respectively). We distinguish two cases.

(i) If each gate of $C_i$ acts on consecutive wires, that is, if $C_i$ contains $\ell \geq 1$ gates $H_1, \ldots, H_\ell$, acting on wires $j_1$ to
$k_1$, ..., $j_\ell$ to $k_\ell$, with $j_1 \leq k_1 < j_2 \leq \cdots < j_\ell \leq k_\ell$, then

$$M(C_i) = \left(I_2^{\otimes (j_1 - 1)} \otimes H_1 \otimes I_2^{\otimes (j_2 - k_1 - 1)} \otimes \cdots \otimes H_\ell \otimes I_2^{\otimes (n - k_\ell)}\right)$$

is the orthogonal matrix of order $2^n \times 2^n$ describing the action of the i-th level of C.
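Case (i) can be sketched in code: the matrix of a level is the Kronecker product of the gate matrices, with identity factors filling the wires that no gate touches. The function `level_matrix`, the `(first_wire, last_wire, matrix)` description of a gate and the helper `basis_input` are assumptions made for this example; wires are numbered from 1 as in the text.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def level_matrix(n, gates):
    """Matrix of one level on n wires.

    gates is a list of (first_wire, last_wire, matrix) triples, sorted by wire
    and acting on disjoint ranges of consecutive wires.
    """
    matrix = [[1]]
    next_wire = 1
    for first, last, h in gates:
        matrix = kron(matrix, identity(2 ** (first - next_wire)))  # idle wires
        matrix = kron(matrix, h)
        next_wire = last + 1
    return kron(matrix, identity(2 ** (n - next_wire + 1)))

def basis_input(bits):
    """d_x: tensor product of e^2_1 (bit 0) and e^2_2 (bit 1) factors."""
    e = {0: [[1], [0]], 1: [[0], [1]]}
    d = [[1]]
    for b in bits:
        d = kron(d, e[b])
    return d

NOT = [[0, 1], [1, 0]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
level = level_matrix(3, [(1, 1, NOT), (2, 3, CNOT)])
```

Applied to `basis_input((0, 1, 0))`, this level negates the first bit and flips the third bit under control of the second, yielding `basis_input((1, 1, 1))`.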

(ii) If $C_i$ contains gates acting on nonadjacent wires, then choose a permutation $\sigma$ of the wires which brings next
to each other those wires which are involved in the same gate. Denote by $D_i$ the i-th level reorganized in this way;
its action on the (permuted) wires is described with a formula $M(D_i)$ built as in case (i) above. The permutation is
implemented by inserting between levels $i - 1$ and $i$ extra depth levels consisting of swap gates, which are collectively
described by a formula $P_\sigma$; it is undone with other extra levels, inserted between $i$ and $i + 1$ and described by $P_{\sigma^{-1}}$.
Any permutation can be expressed as a product of a polynomial number of cycles of the form $(j, j + 1, \ldots, k - 1, k)$,
with $j < k$; therefore it suffices to describe the formulas $P_{j,k}(C)$ and $P^{-1}_{j,k}(C)$ which implement this cycle and its
inverse, respectively. The reader can verify that$^1$

$$P_{j,k}(C) = \left(I_2^{\otimes (j-1)} \otimes T_{j,k} \otimes I_2^{\otimes (n-k)}\right), \quad \text{where } T_{j,k} = \prod_{i=1}^{k-j} \left(I_2^{\otimes (k-j-i)} \otimes T \otimes I_2^{\otimes (i-1)}\right),$$

and

$$P^{-1}_{j,k}(C) = \left(I_2^{\otimes (j-1)} \otimes T^{-1}_{j,k} \otimes I_2^{\otimes (n-k)}\right), \quad \text{where } T^{-1}_{j,k} = \prod_{i=1}^{k-j} \left(I_2^{\otimes (i-1)} \otimes T \otimes I_2^{\otimes (k-j-i)}\right);$$

with $\sigma = ((j_1 \cdots k_1) \cdots (j_\ell \cdots k_\ell))^{-1}$, this yields

$$P_\sigma(C) = P_{j_1,k_1}(C) \cdots P_{j_\ell,k_\ell}(C) \quad \text{and} \quad P_{\sigma^{-1}}(C) = P^{-1}_{j_\ell,k_\ell}(C) \cdots P^{-1}_{j_1,k_1}(C),$$

so that

$$M(C_i) = P_{\sigma^{-1}}(C) \cdot M(D_i) \cdot P_\sigma(C).$$

A sample construction for $j = 1$ and $k = 4$ is depicted in Figure 1.

Figure 1: Simulating an arbitrary controlled-not by a controlled-not acting on neighboring wires.
The complete tensor formula F(C) is given by

$$F(C) = \prod_{i=1}^{m} M(C_i),$$

which can be parenthesized in order to have logarithmic depth. It is readily verified that for each $x \in \{0, 1\}^n$

$$C(x) = \mathrm{val}_K^{2^n,1}(F(C) \cdot d_x).$$

Formula F(C) is logspace constructible from C: in particular, a permutation $\sigma$ suitable for case (ii) can be built by
choosing a reorganization $D_i$ of level $C_i$ in which the gates $H_1, \ldots, H_\ell$ act on wires $j_1$ to $k_1$, ..., $j_\ell$ to $k_\ell$, such that
$1 = j_1$, $k_1 + 1 = j_2$, ..., $k_{\ell-1} + 1 = j_\ell$; then the cyclic decomposition of $\sigma$ has the form $(1, 2, 3, \ldots, h_1)(2, 3, \ldots, h_2) \cdots$
where for each $i \geq 2$, the wires $1, 2, \ldots, i - 1$ are left untouched by the i-th cycle. □

$^1$Note that according to the usual convention, the input-to-output direction in a gate array is left-to-right, while in its matrix representation,
the array's action on its input is given as a product of orthogonal matrices with a column vector, and is read right-to-left.

4.2 From formulas to arrays



In the formula-to-array part, one must deal with the fact that an OSL formula may contain matrices of various sizes, 
and column vectors at atypical locations. The latter may be regarded as a nonstandard or disorderly manner of specifying
the array's inputs. Matrices of nonstandard orders, however, cannot be readily interpreted in terms of Boolean or 
quantum computation: one may accept to work with many-valued bits and qubits, or the matrices may be padded in 
order to turn their orders into powers of 2, which is the option we choose in this paper. 

Lemma 4.2. There exists a polynomial-time algorithm which turns an OSL tensor formula F over semiring K into a
formula $\Pi(F)$ where all subformula sizes are powers of 2, and whose output is

$$\begin{pmatrix} \mathrm{val}_K^{N,1}(F) \\ 0 \end{pmatrix},$$

where 0 denotes a (possibly empty) null block.

Proof. For an integer $n > 0$, let $\pi(n)$ denote the smallest power of 2 greater than or equal to n. We also define a
unary operator $\pi$ which acts as follows on a matrix A:

• if A is an $n \times n$ square matrix, then $\pi(A)$ is a $\pi(n) \times \pi(n)$ block-diagonal square matrix consisting of a copy
of A at the top left position and a copy of the identity matrix $I_{\pi(n)-n}$ at the bottom right;

• if A is an $n \times 1$ column vector, then $\pi(A)$ is $\pi(n) \times 1$ with the entries of A at the first n positions, and value 0
in the $\pi(n) - n$ others;

• if A is neither of the above, then $\pi(A)$ is undefined.

Whenever $A \cdot B$, $\pi(A)$ and $\pi(B)$ are defined, we have $\pi(A \cdot B) = \pi(A) \cdot \pi(B)$, so that in the simple case where F does
not contain any occurrence of the Kronecker product, $\Pi(F)$ is built by replacing each atomic subformula of F with its
image by $\pi$.
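The padding operator and its compatibility with matrix products can be sketched as follows. The function names `next_pow2` and `pad` are choices made for this example; `pad` raises an error where the text leaves the operator undefined.

```python
def next_pow2(n):
    """Smallest power of 2 greater than or equal to n (n > 0)."""
    p = 1
    while p < n:
        p *= 2
    return p

def pad(a):
    """Padding operator: identity-pad a square matrix, zero-pad a column vector."""
    rows, cols = len(a), len(a[0])
    p = next_pow2(rows)
    if rows == cols:
        return [[a[i][j] if i < rows and j < rows else int(i == j)
                 for j in range(p)] for i in range(p)]
    if cols == 1:
        return [[a[i][0]] if i < rows else [0] for i in range(p)]
    raise ValueError("padding is undefined for this matrix shape")

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]   # cyclic permutation on 3 points
V = [[1], [0], [0]]                      # unit column vector e_1^3
```

Here `pad(A)` is the $4 \times 4$ matrix with A in the top left corner and a single 1 in the bottom right, and padding commutes with both products $A \cdot A$ and $A \cdot V$.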

This does not work in general. Consider for example the formula $(A \otimes B) \cdot (V \otimes W)$ where A and B are $33 \times 33$
and $35 \times 35$, respectively, and V and W are $21 \times 1$ and $55 \times 1$, respectively: the orders of $(\pi(A) \otimes \pi(B))$ and
$(\pi(V) \otimes \pi(W))$ do not match. There also exist cases where the orders match but the entries of $(A \otimes B) \cdot (V \otimes W)$ are
not consecutive in the column vector $(\pi(A) \otimes \pi(B)) \cdot (\pi(V) \otimes \pi(W))$. Some subformulas may even yield matrices
which are neither square nor column vectors.
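The arithmetic behind this counterexample, and the scattering of entries caused by naive padding, can be checked in a few lines. The helper `padded_positions` is a hypothetical name introduced for this example; it lists where the entries of a tensor product of two column vectors end up after each factor has been padded separately.

```python
def next_pow2(n):
    """Smallest power of 2 greater than or equal to n (n > 0)."""
    p = 1
    while p < n:
        p *= 2
    return p

# Orders in the example: A is 33 x 33, B is 35 x 35, V is 21 x 1, W is 55 x 1.
unpadded_left = 33 * 35                          # columns of A (x) B
unpadded_right = 21 * 55                         # rows of V (x) W
padded_left = next_pow2(33) * next_pow2(35)      # columns of pi(A) (x) pi(B)
padded_right = next_pow2(21) * next_pow2(55)     # rows of pi(V) (x) pi(W)

def padded_positions(m, n):
    """0-based positions of the entries of A (x) B inside pi(A) (x) pi(B),
    for column vectors A and B with m and n entries respectively."""
    width = next_pow2(n)
    return [i * width + j for i in range(m) for j in range(n)]
```

The unpadded orders agree (both are 1155) while the padded ones are 4096 and 2048. For two vectors of length 3, the nine original entries land at positions 0, 1, 2, 4, 5, 6, 8, 9, 10, which is why a reordering step is needed.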

Nevertheless, we claim that if matrices $\Pi(A)$ and $\Pi(B)$ are available, then there exist permutations Q and Q' and a
block H such that

$$Q \cdot (\Pi(A) \otimes \Pi(B)) \cdot Q' = \begin{pmatrix} A \otimes B & 0 \\ 0 & H \end{pmatrix},$$

where Q and Q' can be specified with polynomial-size sum-free tensor formulas. (Note that H is orthogonal whenever
both A and B are.) In the special case where both A and B are column vectors, $Q' = I_1$ and the claim reads

$$Q \cdot (\Pi(A) \otimes \Pi(B)) = \begin{pmatrix} A \otimes B \\ H \end{pmatrix}.$$







We first show how to reorder the lines of $\Pi(A) \otimes \Pi(B)$ where both A and B are column vectors. With $A =
[x_1 \cdots x_m]^T$ and $B = [y_1 \cdots y_n]^T$, let $\mu = 2^j \geq \pi(m)$, $\alpha = \mu - m$, $\nu = 2^k \geq \pi(n)$, and $\tau = \nu - n$. We start
with

$$\Pi(A) = [x_1 \cdots x_m \;\; x_{m+1} \cdots x_\mu]^T, \qquad \Pi(B) = [y_1 \cdots y_n \;\; y_{n+1} \cdots y_\nu]^T$$

and

$$\Pi(A) \otimes \Pi(B) = [x_1 y_1 \;\; x_1 y_2 \;\cdots\; x_1 y_\nu \;\; x_2 y_1 \;\; x_2 y_2 \;\cdots\; x_\mu y_\nu]^T,$$

where the $x_i$'s with $i > m$ and the $y_i$'s with $i > n$ are the elements added by padding. Multiplying to the left with the stride permutation gives

$$P^{\mu\nu}_{\nu} \cdot (\Pi(A) \otimes \Pi(B)) = [x_1 y_1 \;\; x_2 y_1 \;\cdots\; x_\mu y_1 \;\; x_1 y_2 \;\; x_2 y_2 \;\cdots\; x_\mu y_\nu]^T.$$

Next we multiply with the matrix

$$R^{\mu}_{n,\tau} = \begin{pmatrix} P^{\mu n}_{\mu} & 0 \\ 0 & (N^{\mu}_{\tau})^{j} \end{pmatrix},$$

where $N^{\mu}_{\tau} = I_\tau \otimes P^{\mu}_{2}$. The reader can verify that

$$R^{\mu}_{n,\tau} \cdot P^{\mu\nu}_{\nu} \cdot (\Pi(A) \otimes \Pi(B)) = [x_1 y_1 \;\; x_1 y_2 \;\cdots\; x_1 y_n \;\cdots\; x_m y_n \;\; H]^T = [(A \otimes B) \;\; H]^T,$$

where H is a size $\mu\nu - mn$ block whose first $n\alpha$ entries are $x_{m+1} y_1, \ldots, x_\mu y_n$ and the other positions contain a
permutation of $x_1 y_{n+1}, \ldots, x_1 y_\nu, \ldots, x_\mu y_\nu$.
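The two-step reordering for column vectors can be tried out on small inputs. This sketch assumes that the stride permutation maps $x \otimes y$ to $y \otimes x$, takes $\mu = \pi(m)$ and $\nu = \pi(n)$, and uses for the second step a block-diagonal matrix whose top left block is a stride permutation of order $n\mu$ and whose bottom right block is simply the identity; that simplified second block, and all function names, are assumptions of the example rather than the exact matrices of the proof.

```python
def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

def stride_perm(total, stride):
    """Permutation matrix sending e^m_i (x) e^n_j to e^n_j (x) e^m_i,
    with n = stride and m = total // stride."""
    m, n = total // stride, stride
    p = [[0] * total for _ in range(total)]
    for i in range(m):
        for j in range(n):
            p[j * m + i][i * n + j] = 1
    return p

def block_diag_with_identity(a, extra):
    """Block-diagonal matrix with a at the top left and I_extra at the bottom right."""
    size = len(a) + extra
    return [[a[i][j] if i < len(a) and j < len(a) else int(i == j)
             for j in range(size)] for i in range(size)]

def apply(matrix, vector):
    return [sum(row[c] * vector[c] for c in range(len(vector))) for row in matrix]

def kron_vec(x, y):
    return [a * b for a in x for b in y]

def reorder(x, y):
    """Bring the entries of x (x) y to the front of pad(x) (x) pad(y)."""
    m, n = len(x), len(y)
    mu, nu = next_pow2(m), next_pow2(n)
    xp, yp = x + [0] * (mu - m), y + [0] * (nu - n)
    swapped = apply(stride_perm(mu * nu, nu), kron_vec(xp, yp))   # pad(y) (x) pad(x)
    r = block_diag_with_identity(stride_perm(n * mu, mu), (nu - n) * mu)
    return apply(r, swapped)

result = reorder([1, 2, 3], [5, 7, 11])
```

The first nine entries of `result` are exactly the entries of the tensor product of the two unpadded vectors, in order; the remaining seven positions hold the products involving padding.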

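There remains to show how to build matrices $P^{\mu\nu}_{\nu}$ and $R^{\mu}_{n,\tau}$ with polynomial-size sum-free tensor formulas. By
Proposition 3.1, it is readily verified that $P^{\mu\nu}_{\nu} = (P^{\mu\nu}_{2})^{k}$ and that for any $\ell \geq 1$, the induction formula $P^{2^{\ell+2}}_{2} =
(P^{2^{\ell+1}}_{2} \otimes I_2) \cdot (I_{2^{\ell}} \otimes P^{4}_{2})$ yields for the matrix $P^{\mu\nu}_{2}$ a quadratic-size tensor formula with input nodes for $I_2$ and $P^{4}_{2}$.
Meanwhile, $R^{\mu}_{n,\tau} = (S^{\mu}_{n,\tau})^{j}$, where

$$S^{\mu}_{n,\tau} = \begin{pmatrix} P^{\mu n}_{2} & 0 \\ 0 & N^{\mu}_{\tau} \end{pmatrix}.$$

In order to build this matrix, let

$$U = \begin{pmatrix} P^{2n}_{2} & 0 \\ 0 & I_{2\tau} \end{pmatrix}$$

and $P^{\mu n}_{2} = (P^{2n}_{2} \otimes I_{2^{j-1}}) \cdot (I_n \otimes P^{\mu}_{2})$ by Proposition 3.1; observe that

$$(U \otimes I_{2^{j-1}}) \cdot (I_\nu \otimes P^{\mu}_{2}) = \begin{pmatrix} P^{2n}_{2} \otimes I_{2^{j-1}} & 0 \\ 0 & I_{\tau\mu} \end{pmatrix} \cdot \begin{pmatrix} I_n \otimes P^{\mu}_{2} & 0 \\ 0 & I_\tau \otimes P^{\mu}_{2} \end{pmatrix} = S^{\mu}_{n,\tau}.$$

Expressed in this way, matrix $R^{\mu}_{n,\tau}$ can be built with a polynomial-size sum-free tensor formula, where matrix U is
either given explicitly by a made-to-purpose gate if n is the diameter of an input matrix, or built inductively in the
case where $n = \pi(p)$ for some p, because in this case $U = P^{2n}_{2}$.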

The same technique applies to reorder the lines for arbitrary matrices A and B; in this case the $x_i$'s and $y_j$'s are lines
and each $x_i y_j$ in the above equations must be read as $x_i \otimes y_j$. The claim for the existence of a matrix Q' which
reorders the columns is proved in a dual manner.

Let $F$ be an OSL formula; the following algorithm builds a formula $\Pi(F)$ which satisfies the conditions of the Lemma, by recursively defining $\Pi(G)$ for each subformula $G$ of $F$.

• For each atomic subformula $G$, let $\Pi(G) = \pi(G)$.

• Repeat recursively from the leaves toward the root of $F$: for each subformula $G = H \circ K$ for which $\Pi(H)$ and $\Pi(K)$ have already been computed and $\circ \in \{\cdot, \otimes\}$:

  • if $\circ$ is "$\otimes$" then let $\Pi(G) = Q \cdot (\Pi(H) \otimes \Pi(K)) \cdot Q'$ and insert the appropriate subformulas for $Q$ and $Q'$ (note that $\Pi(Q) = Q$ and $\Pi(Q') = Q'$);

  • otherwise $\circ$ is "$\cdot$"; if the orders of $\Pi(H)$ and $\Pi(K)$ match, then let $\Pi(G) = \Pi(H) \cdot \Pi(K)$; else they differ by a power of 2 and the smaller matrix must undergo some padding, that is, either $\Pi(G) = (I_2^{\otimes i} \otimes \Pi(H)) \cdot \Pi(K)$, or $\Pi(G) = \Pi(H) \cdot ((e_1)^{\otimes i} \otimes \Pi(K))$, for an appropriate $i$.

□ 

Lemma 4.3. There is a polytime computable function which, from an OSL tensor formula $F$ over semiring $K$, computes a polynomial-size gate array $C(F)$ whose input is represented with a unit vector $V$, whose action over the inputs is given by an orthogonal matrix $M$, and such that matrices $MV$ and $\mathrm{val}^{n,1}_{K}(F)$ satisfy

$$MV = \begin{bmatrix} \mathrm{val}^{n,1}_{K}(F) \\ 0 \end{bmatrix}$$

where $0$ denotes a (possibly empty) null block.



Proof. The formula $\Pi(F)$ is used as a specification for a gate array $C(F)$. For each atomic subformula $G$ of $F$, either $G$ is $m \times m$ for some $m \le |F|$, where $|F|$ is the diameter of $F$, and $\Pi(G)$ is interpreted as the specification of a gate with $\log_2 \pi(m) = \lceil \log_2 m \rceil$ inputs, or $G$ is $m \times 1$ and $\Pi(G)$ specifies the probability amplitudes for all possible combinations of values of $\log_2 \pi(m) = \lceil \log_2 m \rceil$ input bits or qubits. In the former case, a polynomial-size array of elementary gates implements the operation specified by $\Pi(G)$; in the latter case, a size $m^{O(1)}$ array is built to take as input some constant unit vector (say $e_1^{\pi(m)}$) and yield as output the vector $\Pi(G)$. Next, working recursively from the leaves toward the root of $\Pi(F)$, the interior nodes are interpreted as specifications for combining the subarrays either in a sequential (nodes labelled "$\cdot$") or parallel (nodes labelled "$\otimes$") manner. The resulting gate array has polynomial size and satisfies the conditions of the lemma. □



5 Complexity results 

Over the Boolean semiring, a column vector is a unit vector as soon as it is nonzero, so that the standard, promise and 
nonzero versions of problem SFT coincide. 

Theorem 5.1. Over the Boolean semiring, problem SFT is P-complete under logspace reducibility. 



Proof. Given a size $n$ instance $(F, k)$ of SFT(B), we use Lemma 4.3 to build an equivalent reversible gate array $C(F)$ over $N = n^{O(1)}$ bits, and we compute the output value of each of these bits (i.e. we solve $N$ instances of the usual Boolean circuit value problem). This yields a combination of $N$ values which corresponds to a given position along the diagonal of

$$(\mathrm{val}^{n,1}_{B}(F)) \cdot (\mathrm{val}^{n,1}_{B}(F))^T$$

under the convention that combinations $00 \cdots 0, \ldots, 11 \cdots 1$ correspond to lines (and columns) $1, \ldots, 2^N$, respectively. The hardness part consists in using Lemma |47j to reduce the P-complete circuit value problem [16] to an instance of SFT(B). □

For the quantum and probabilistic cases we are mainly interested in the promise version of SFT, which gives us a striking description for the difference between complexity classes BPP and BQP.

Theorem 5.2. The promise version of problem SFT(Q) is complete for the class BQP, under logspace reducibility. 



Proof. The hardness part is a generic reduction. Using Definition 2.3, we start with an $m$-leveled gate array $C$ on $n$ qubits numbered 1 to $n$ whose accepting subspace is defined by setting qubit 1 to $|1\rangle$, and whose gates are defined with unitary matrices over Q. Denote by $f_C$ the probability that qubit 1 be projected to $|1\rangle$ when the measurement takes place. We use Lemma [PI to build from $C$ an equivalent tensor formula $F(C) = \prod_{j=1}^{m} M(C_j)$. Meanwhile we define for the array's input qubits a tensor product $V$ of $n$ unit vectors of size $2 \times 1$. An easy induction on $j$ shows that

$$\mathrm{val}^{2^n,1}_{Q}\Big(\prod_{i=1}^{j} M(C_i) \cdot V\Big)$$

is exactly the vector of amplitudes after level $j$ in $C$. Thus the last $2^{n-1}$ entries along the diagonal of

$$(\mathrm{val}^{2^n,1}_{Q}(F(C) \cdot V)) \cdot (\mathrm{val}^{2^n,1}_{Q}(F(C) \cdot V))^T$$

add up to the value of $f_C$, and the original array's input is accepted iff this partial trace exceeds the threshold by which acceptance by $C$ was defined. Scrutiny of the reduction shows that the constraint on $f_C$ is transported intact from the description of $C$ to the SFT(Q) instance $F(C) \cdot V$.

In the other direction, we use Lemma [Q| to translate an instance $(F, k)$ for SFT(Q) into the description of a quantum gate array over $m$ qubits, $m \ge \log_2 n$, and of its inputs; the $k$th partial trace of

$$(\mathrm{val}^{n,1}_{Q}(F)) \cdot (\mathrm{val}^{n,1}_{Q}(F))^T$$

represents the probability that the output qubits of this array be projected onto the direct sum of the dimension-1 subspaces generated by $|2^m - 1\rangle = |1 \cdots 11\rangle$, $|2^m - 2\rangle = |1 \cdots 10\rangle$, $|2^m - 3\rangle = |1 \cdots 01\rangle$, ..., and $|2^m - k\rangle$. The promise on the partial trace is transported unmodified from the input tensor formula to the quantum gate array. □
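
As a small illustration of how a partial trace reads off an acceptance probability, the following Python sketch evaluates a two-qubit instance with exact rational arithmetic. The gate `G`, the input vector and the helper names are assumptions chosen for the example; they do not come from the proof.

```python
from fractions import Fraction as Fr


def kron_vec(u, v):
    """Tensor product of two column vectors."""
    return [a * b for a in u for b in v]


def kron_mat(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]


def partial_trace(v, k):
    """Sum of the lowermost k diagonal entries of v . v^T for a real vector v."""
    return sum(x * x for x in v[len(v) - k:])


G = [[Fr(3, 5), Fr(4, 5)], [Fr(-4, 5), Fr(3, 5)]]   # orthogonal, rational entries
I2 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
e1 = [Fr(1), Fr(0)]

V = kron_vec(e1, e1)                 # both qubits start in |0>
out = matvec(kron_mat(G, I2), V)     # G acts on qubit 1 only
f_C = partial_trace(out, 2)          # last 2^(n-1) entries, with n = 2
```

Here `f_C` evaluates to 16/25, the probability that qubit 1 is projected to $|1\rangle$, which lies above the 1/2 threshold.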

The argument described above can be used to prove that the "standard" (non-promise) version of problem SFT(Q) is complete for PP, defined by removing the constraint from definition ^4j. Finally, when the proof is applied to the "nonzero" version of problem SFT(Q), a completeness statement is obtained for the class NQP.



Finally, we consider problem SFT over the semiring of the nonnegative rational numbers. Note that, just as in the 
quantum case, the entries in the column vectors are regarded as probability amplitudes. All the gates do in a classical 
reversible array is permute the different vector components without ever mixing or combining them; no interference 
ever takes place and it does not matter in terms of the final result, whether the probabilities are represented as such or 
as amplitudes. 
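
The remark about interference can be made concrete with a one-bit example. The Python sketch below is an illustration under assumptions of our own: the gate `X` (a permutation) stands for a classical reversible gate, the gate `G` (with a negative entry) for a gate available only over Q, and all names are hypothetical. It compares evolving amplitudes and squaring at the end against evolving probabilities directly with the entrywise-squared matrices.

```python
from fractions import Fraction as Fr


def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]


def run(gates, v):
    """Apply the gates in sequence to the vector v."""
    for g in gates:
        v = matvec(g, v)
    return v


def squared(M):
    """Entrywise square of a matrix (amplitudes to transition probabilities)."""
    return [[a * a for a in row] for row in M]


def probs(v):
    return [x * x for x in v]


X = [[Fr(0), Fr(1)], [Fr(1), Fr(0)]]                 # permutation gate
G = [[Fr(3, 5), Fr(4, 5)], [Fr(4, 5), Fr(-3, 5)]]    # orthogonal, one negative entry

v0 = [Fr(3, 5), Fr(4, 5)]
classical_amp = probs(run([X, X], v0))
classical_prob = run([squared(X), squared(X)], probs(v0))

e1 = [Fr(1), Fr(0)]
quantum_amp = probs(run([G, G], e1))
quantum_prob = run([squared(G), squared(G)], probs(e1))
```

For the permutation gate the two computations agree, whereas for `G` the amplitude computation returns `[1, 0]` because the two paths into the second component cancel, while the probability computation returns `[337/625, 288/625]`.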

Theorem 5.3. Problem SFT($Q^+$) is PP-complete under logspace reducibility.

Proof. For a generic reduction, we start with a reversible gate array $C$ whose input is a string of $N = s(n) + t(n)$ bits, where the initial $s(n)$ bits are the ancillary bits, all set to 0, and the other $t(n)$ bits are random. By Lemma |-kl $C$ and its input can be encoded into $F(C) \cdot V$, where the $2^N \times 1$ unit vector $V$ specifies the inputs, i.e. a bit string $c_1 \cdots c_{s(n)} d_1 \cdots d_{t(n)}$ which satisfies the conditions

i. $c_i = 0$ for all $1 \le i \le s(n)$, and

ii. all combinations of values for the random bits $d_1 \cdots d_{t(n)}$ are equally likely.

The corresponding entries in the vector $\mathrm{val}^{2^N,1}_{Q^+}(V)$ carry value $1/\sqrt{2^{t(n)}}$; all others contain 0. We demand wlog that $t(n)$ be even; dealing with the random bits pairwise enables us to ensure that no irrational values are necessary. Then



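[display illegible in the source; it pairs the factors $1/\sqrt{2}$ of two random bits into the rational vector $[\,1/2 \;\; 1/2 \;\; 1/2 \;\; 1/2\,]^T$]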
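
The pairwise treatment of the random bits can be sketched as follows. This Python fragment is an illustration under our own assumptions (the function name `input_vector` and the sample sizes are hypothetical): it builds the input vector as a tensor product of $s$ copies of $e_1$ for the ancillary bits and $t/2$ copies of the vector $[1/2\; 1/2\; 1/2\; 1/2]^T$ for the random bits, so that every entry stays rational.

```python
from fractions import Fraction as Fr


def kron_vec(u, v):
    """Tensor product of two column vectors."""
    return [a * b for a in u for b in v]


def input_vector(s, t):
    """Unit vector for s ancillary bits set to 0 followed by t random bits."""
    if t % 2:
        raise ValueError("t must be even so that all entries are rational")
    e1 = [Fr(1), Fr(0)]            # a bit fixed to 0
    pair = [Fr(1, 2)] * 4          # two uniformly random bits, taken together
    v = [Fr(1)]
    for _ in range(s):
        v = kron_vec(v, e1)
    for _ in range(t // 2):
        v = kron_vec(v, pair)
    return v


V = input_vector(2, 4)             # N = 6 bits, vector of size 2^6
```

In this run the first $2^4$ entries of `V` equal $1/4 = 1/\sqrt{2^4}$, all others are 0, and the squares of the entries sum to 1.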



Let the acceptance condition be that bit $c_1$ has value 1 at the output level. This corresponds to the first $2^{N-1}$ positions along the diagonal of $(\mathrm{val}^{2^N,1}_{Q^+}(F(C) \cdot V)) \cdot (\mathrm{val}^{2^N,1}_{Q^+}(F(C) \cdot V))^T$.



In the other direction, consider an instance $(F, k)$ for SFT($Q^+$). We have discussed in Section i.2 how the column vectors and square matrices are interpreted as "inputs" and "gates" in the equivalent array, through the construction of a formula $\Pi(F)$ where all matrices have orders which are powers of 2. We add extra steps to the construction of $\Pi(F)$ in order to enforce the further condition, that all fractions have a power of 2 as denominator.

Consider an $n \times 1$ unit vector $v_i = [\frac{a_1}{d} \cdots \frac{a_n}{d}]^T$, where $a_1^2 + \cdots + a_n^2 = d^2$. Let $d$ not be a power of 2: $d < \pi(d)$. The reader can verify that there exist integers $b_1, \ldots, b_p$ such that $a_1^2 + \cdots + a_n^2 + b_1^2 + \cdots + b_p^2 = \pi(d)^2$ and $p \le 3\lceil \log_2 d \rceil$. Let $q = \min\{2^{2i} : 2^{2i} \ge n + 3\lceil \log_2 d \rceil\}$, and embed $v_i$ into the $q \times 1$ vector

$$\left[ \frac{a_1}{\pi(d)} \cdots \frac{a_n}{\pi(d)} \;\; \frac{b_1}{\pi(d)} \cdots \frac{b_p}{\pi(d)} \;\; 0 \cdots 0 \right]^T$$

which can be interpreted as a distribution of probability amplitudes for $\log_2 q$ input bits. Denote by $\delta_i$ the fraction $d/\pi(d)$. Repeating this process on each input column vector yields an instance $(G, k)$ where the resulting partial trace is the same one obtained from $(F, k)$, times a factor $\Delta^2 = \prod_i \delta_i^2$. If we accept instance $(F, k)$ whenever the partial trace is above a threshold $\alpha$, then there exists a probabilistic polytime Turing machine $M$ which accepts $(G, k)$ with probability above $\alpha\Delta^2$.

The algorithm of $M$ is divided into three phases; the first consists in building the new instance $(G, k)$ from the original $(F, k)$, the second in choosing nondeterministically a column vector to give as input to the equivalent array $C(G)$, and the third in deterministically simulating $C(G)$ on its input. In the second step $M$ nondeterministically selects values for the bits in the string $d_1 \cdots d_{t(n)}$; the preprocessing step has organized their probability distribution in order to ensure that this can be done with a sequence of nondeterministic binary choices, followed by a look-up into a table which is linear in size and is computed from the column vectors in $(F, k)$. □
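
The embedding step that forces power-of-2 denominators can be tried on a small vector. The Python sketch below is a simplified illustration and not the construction of the proof: it finds the extra integers $b_j$ with a greedy largest-square rule (which makes no attempt to meet the bound on $p$ stated above), it pads to the next power of 2 rather than to the value $q$ of the text, and the names `next_pow2`, `extra_squares` and `embed` are hypothetical.

```python
from fractions import Fraction as Fr
from math import isqrt


def next_pow2(d):
    """pi(d): the smallest power of 2 that is >= d, for d >= 1."""
    return 1 << (d - 1).bit_length()


def extra_squares(r):
    """Greedy decomposition of r >= 0 into a sum of integer squares."""
    bs = []
    while r > 0:
        b = isqrt(r)
        bs.append(b)
        r -= b * b
    return bs


def embed(a, d):
    """Embed the unit vector [a_1/d ... a_n/d] into one with denominator pi(d)."""
    if sum(x * x for x in a) != d * d:
        raise ValueError("the numerators must satisfy sum(a_i^2) == d^2")
    D = next_pow2(d)
    b = extra_squares(D * D - d * d)
    entries = [Fr(x, D) for x in a + b]
    q = next_pow2(len(entries))              # simplification, see the lead-in
    return entries + [Fr(0)] * (q - len(entries))


w = embed([3, 4], 5)                         # the vector [3/5, 4/5]
```

For this input, $\pi(5) = 8$ and $64 - 25 = 36 + 1 + 1 + 1$, so `w` has numerators 3, 4, 6, 1, 1, 1 over 8 followed by two zeros; each original entry is scaled by $\delta = 5/8$ and `w` is again a unit vector.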



The reader can verify that this proof can be rewritten in terms of the promise problem SFTP($Q^+$) and the complexity class BPP; note, however, that in the second part of the proof the cutpoint and the size of the empty interval can be modified. Meanwhile, the complexity of the nonzero version is obtained with a straightforward application of the above argument.

Figure 2: Summary of completeness results



Corollary 5.4. The promise and nonzero versions of problem SFT($Q^+$) are BPP-complete and NP-complete, respectively, under logspace reducibility.

6 Conclusion 

Through the study of problem SFT, we have developed a common algebraic description for polynomial-time complexity classes, where the choice of the semiring determines the complexity class. For the inclusion chain P ⊆ BPP ⊆ BQP, in particular, the classical model of polytime probabilistic computation turns out to be a special case of polytime quantum computation where interference between computations is ruled out.